Difference between Kinesis Data Stream and SQS
What is the difference between Kinesis and SQS?
Amazon Kinesis differs from Amazon Simple Queue Service (SQS) in that it is built for real-time processing of large volumes of streaming data.
Kinesis Data Stream
Amazon Kinesis Data Streams supports large-scale data ingestion and real-time processing of streaming data. Records are stored in order, and consumers can read or replay them in that same order.
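To make the ordering and replay behaviour concrete, here is a minimal in-memory sketch. It does not call AWS. Kinesis does hash each record's partition key with MD5 to choose a shard, but the even split of the hash space, the `ToyStream` class, and the `shard_for_key` helper are assumptions made for illustration only.

```python
import hashlib


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index.

    Assumption: the 128-bit MD5 hash space is split evenly across shards.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = 2**128 // shard_count
    # Clamp so the top of the hash space still lands in the last shard.
    return min(hash_value // range_size, shard_count - 1)


class ToyStream:
    """Hypothetical stand-in for a stream: each shard is an append-only list."""

    def __init__(self, shard_count: int):
        self.shards = [[] for _ in range(shard_count)]

    def put(self, partition_key: str, data: str) -> int:
        """Append a record to the shard chosen by its partition key."""
        index = shard_for_key(partition_key, len(self.shards))
        self.shards[index].append((partition_key, data))
        return index

    def read(self, shard_index: int, start: int = 0) -> list:
        """Return records from a position; reading never removes them."""
        return list(self.shards[shard_index][start:])
```

Because records with the same partition key always land in the same shard, and reading does not delete anything, reading a shard a second time from position 0 returns the records in the same order as the first read.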
Kinesis Data Streams:
- Data can be consumed many times
- Data is deleted after the retention period
- Ordering of records is preserved (at the shard level) – even during replays
- Build multiple applications reading from the same stream independently (Pub/Sub)
- “Streaming MapReduce” querying capability (Spark, Flink...)
- Checkpointing is needed to track the progress of consumption (e.g. the Kinesis Client Library stores its checkpoints in DynamoDB)
- Provisioned mode or on-demand mode
- Stores records for 24 hours by default; retention can be extended up to 365 days
- Can trigger AWS Lambda directly, and can deliver stream records to services such as Amazon S3, Amazon Redshift, Amazon OpenSearch Service (formerly Amazon Elasticsearch Service), and Splunk through Kinesis Data Firehose
- In provisioned mode, each shard supports 1 MB/s of writes (producers) and 2 MB/s of reads (consumers)
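The per-shard figures above (1 MB/s in, 2 MB/s out) can be turned into a rough sizing rule for provisioned mode. This is a back-of-the-envelope sketch. The function name `shards_needed` is hypothetical, and it ignores other limits such as records per second.

```python
import math

WRITE_MB_PER_SHARD = 1.0  # producer throughput per shard, from the list above
READ_MB_PER_SHARD = 2.0   # consumer throughput per shard, from the list above


def shards_needed(write_mb_per_s: float, read_mb_per_s: float) -> int:
    """Estimate the shard count that covers both write and read throughput."""
    for_writes = math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD)
    for_reads = math.ceil(read_mb_per_s / READ_MB_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, for_writes, for_reads)
```

For example, a workload writing 5 MB/s and reading 4 MB/s is bound by writes: `shards_needed(5, 4)` evaluates to 5, since five shards are required to absorb the writes while two would cover the reads.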
AWS SQS
Amazon Simple Queue Service (SQS) is a fully managed, highly scalable message queue for storing messages and passing them between application components.
SQS:
- Queue, decouple applications
- One application per queue
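- Messages are deleted by the consumer once processed (ack); a message that is not deleted becomes visible again after the visibility timeout (fail)
- In standard queues, messages are processed independently, with no ordering guarantee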
- Ordering for FIFO queues (decreased throughput)
- Capability to “delay” messages
- Scales dynamically with load (no operational work)
- Message retention is configurable from 1 minute to 14 days (default: 4 days)
- Other services can be integrated through AWS Lambda
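- FIFO queues handle up to 3,000 messages/s with batching, or 30,000 messages/s and more in high-throughput mode (the exact quota varies by Region); standard queues have nearly unlimited throughput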
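The delete-after-consumption and delay behaviour in the list above can be sketched with a small in-memory model. It does not call AWS. The `ToyQueue` class and its methods are hypothetical, time is passed in explicitly as `now` (in seconds) so the behaviour is easy to follow, and real SQS deletes by receipt handle rather than by message id.

```python
import itertools


class ToyQueue:
    """Hypothetical stand-in for a queue with delay and visibility timeout."""

    def __init__(self, visibility_timeout: int = 30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time at which it is visible]
        self._ids = itertools.count(1)

    def send(self, body: str, now: int, delay: int = 0) -> int:
        """Store a message; it stays hidden until the delay has elapsed."""
        message_id = next(self._ids)
        self._messages[message_id] = [body, now + delay]
        return message_id

    def receive(self, now: int):
        """Return one visible message and hide it for the visibility timeout."""
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return message_id, entry[0]
        return None

    def delete(self, message_id: int) -> None:
        """Acknowledge a message: once deleted it is never delivered again."""
        self._messages.pop(message_id, None)
```

A consumer that receives a message but fails before calling `delete` will see the same message again once the visibility timeout expires. This is the main contrast with a Kinesis stream, where records stay in place until the retention period ends.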
Scenarios to use SQS vs Kinesis
SQS can be used for order processing, image processing, auto scaling of workers based on the number of queued messages, and buffering or batching messages for later processing.
Kinesis Data Streams can be used for:
- Rapid collection and processing of log and event data
- Real-time metrics and reports
- Real-time data analytics
- Gaming data feeds
- Complex stream processing
- Data feeds from IoT devices